SUMMARY


This paper discusses the two-sample test of location
based on the comparison of two distribution-free one-sample
confidence intervals derived from sign statistics. This
test procedure, first introduced by Hettmansperger (1984),
rejects the null hypothesis of equal population medians
when the two intervals are disjoint. He presents three
different ways to select the two one-sample intervals, and one
choice leads to Mood's test. All solutions have the same
Pitman efficiency. This paper shows that the choices can
be distinguished on the basis of Bahadur efficiency. We
formulate the problem in terms of (asymptotically) fixed-width
confidence intervals. In this context various median
tests (including Mood's test) arise as special cases and
they yield different performance. The solution that
specifies equal asymptotic lengths for the one-sample
intervals (which is different from Mood's test) is recommended.

Some key words: Bahadur efficiency; Fixed-width confidence
interval; Pitman efficiency; Probability of large
deviations; Sign statistic; Two-sample location problem.


1.  INTRODUCTION 


The  two-sample  test  of  location  discussed  in  this 
paper  is  based  on  the  comparison  of  two  distribution 
free  one-sample  confidence  intervals.  The  test  rejects 
the  null  hypothesis  of  equal  population  medians  if  the 
intervals  fail  to  overlap. 

More precisely, let $\{X_i\}_{i=1}^{m}$ and $\{Y_j\}_{j=1}^{n}$
represent independent random samples from the respective
populations $F_x(\cdot) = F(\cdot - \theta_x)$ and
$F_y(\cdot) = F(\cdot - \theta_y)$ with unique medians
$\theta_x$ and $\theta_y$. Let $\xi_p$ denote the
$p$th quantile of $F$, $0 < p < 1$. We assume that for
all $p$

    $F(\cdot)$ is twice differentiable at $\xi_p$,
    with $F'(\xi_p) = f(\xi_p) > 0$.                          (1.1)


Let the sign-interval on the X-sample be given by

    $[L_x, U_x] = [X_{(d_x)}, X_{(u_x)}]$                       (1.2)

where the endpoints are the $d_x$th and $u_x$th observations
of the ordered sample

    $X_{(1)} \le X_{(2)} \le \cdots \le X_{(m)}$


$0 < \lambda < 1$ .


(1.6) 


(i) Then under $H_0 : \Delta = 0$,

    $\alpha = P\{U_x < L_y\} + P\{U_y < L_x\} \to 2\Phi(-z)$

where $z = (1-\lambda)^{1/2} z_x + \lambda^{1/2} z_y$ .           (1.7)

(ii) Let $\Lambda$ denote the length of the two-sample
confidence interval (1.5). If $z_x$ and $z_y$ satisfy the
condition (1.7), then with probability 1

    $(m+n)^{1/2} \Lambda \to z / ((\lambda(1-\lambda))^{1/2} f(0))$ .   (1.8)

We note that the two one-sample intervals (for $\theta_x$ and
$\theta_y$) have respective approximate coverage probabilities
$\gamma_x = 1 - 2\alpha_x$ and $\gamma_y = 1 - 2\alpha_y$. This follows from
(1.3) and the normal approximation to the binomial
distribution.

Now let $\alpha$ and $\lambda$, $0 < \lambda < 1$, be given and
define $z$ by $\alpha = 2\Phi(-z)$. Select $z_x$ and $z_y$ so that
they satisfy (1.7). By (1.3) this determines the one-sample
sign-intervals (that is, the depths). The resulting
two-sample test is of approximate size $\alpha$.
Clearly there are infinitely many choices for $z_x$ and
$z_y$. Hettmansperger (1984) discusses three different
choices. He recommends selecting equal confidence
coefficients $\gamma_x = \gamma_y$, or equivalently $z_x = z_y$,


because these $z$ values are essentially constant with respect
to reasonable ratios of sample sizes. More precisely,
by (1.7),

    $z_x = z_y = z / ((1-\lambda)^{1/2} + \lambda^{1/2})$ .


Another choice leads to Mood's (1950) median test.
(For a discussion see Pratt (1964) and Gastwirth (1968).)
Let, for simplicity, $m + n = 2r$, $m \ge n$. The Mood-interval
for $\Delta$ is defined as follows:

    $[\, Y_{(d)} - X_{((m+n)/2-d+1)} ,\; Y_{(n-d+1)} - X_{((m-n)/2+d)} \,]$ .

This interval is obtained by inverting the acceptance
region of a two-sided test based on the Mood statistic
which follows a hypergeometric distribution under
$H_0 : \Delta = 0$. From the normal approximation $d$ is chosen
so that an approximate size $\alpha$ test is achieved.
That is,

    $d = n/2 + .5 - z\,(mn/(4(m+n-1)))^{1/2}$                   (1.9)

where $z$ is such that $\Phi(-z) = \alpha/2$. We can consider
this interval as being constructed from two sign-intervals
with depths $d_y = d$ and $d_x = (m-n)/2 + d$.
Statement (1.9) is (asymptotically) equivalent to (1.3) with

    $z_y = z\lambda^{1/2}$ ,  $z_x = z(1-\lambda)^{1/2}$

and the condition (1.7) is clearly satisfied.
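
The two choices described so far can be checked numerically against condition (1.7). The following is a minimal sketch, not part of the original development: the function names are hypothetical, and the square roots in the Mood choice are taken as implied by (1.7) and (1.9).

```python
from statistics import NormalDist


def z_from_alpha(alpha):
    """Solve alpha = 2 * Phi(-z) for z."""
    return -NormalDist().inv_cdf(alpha / 2.0)


def equal_coefficients(z, lam):
    """Equal confidence coefficients: z_x = z_y, scaled so that (1.7) holds."""
    zx = z / ((1.0 - lam) ** 0.5 + lam ** 0.5)
    return zx, zx


def mood_choice(z, lam):
    """Mood's test: z_x = z (1 - lam)^(1/2), z_y = z lam^(1/2)."""
    return z * (1.0 - lam) ** 0.5, z * lam ** 0.5


def combined_z(zx, zy, lam):
    """Right-hand side of (1.7): (1 - lam)^(1/2) z_x + lam^(1/2) z_y."""
    return (1.0 - lam) ** 0.5 * zx + lam ** 0.5 * zy


if __name__ == "__main__":
    z = z_from_alpha(0.05)
    for lam in (0.5, 0.75, 0.9):
        for choice in (equal_coefficients, mood_choice):
            zx, zy = choice(z, lam)
            print(choice.__name__, lam, round(zx, 4), round(zy, 4),
                  round(combined_z(zx, zy, lam), 4))
```

Both choices return the same combined $z$, which is why they cannot be separated by the asymptotic length in (1.8).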

The starting point for this paper is the observation
that, according to (1.8), all choices of $z_x$ and $z_y$
lead to the same Pitman efficiency, as
long as (1.7) is satisfied. The choices can be distinguished,
however, by an alternative notion which is
Bahadur's efficiency. The analysis of this efficiency
leads to a formulation of the problem in terms of
(asymptotically) fixed-width confidence intervals. We
compare the rates at which the Type I error probabilities
tend to zero while the lengths remain fixed at
(or tend to) a positive constant. In this context the
various special choices (including Mood's test) yield
different performance. On the basis of this efficiency
criterion, we then recommend the solution that specifies
equal asymptotic lengths for the one-sample intervals,
which is (except in the case of equal sample
sizes) different from both the Mood solution and the
equal confidence coefficients recommendation.

In Section 2 the exact size of the two-sample test
is derived. In Section 3 the two-sample test procedure
(1.4) is represented in terms of a sum statistic, and
the probability distribution function (under $H_0$) of
this statistic is derived using an urn model argument.
A large deviations result is obtained and Bahadur efficiency
is discussed in Section 4. Numerical evaluations
and recommendations for the practitioner are given in
the final section.


2.  TYPE  I  ERROR  PROBABILITY 


Under $H_0 : \Delta = 0$, the $\{X_i\}_{i=1}^{m}$ and $\{Y_j\}_{j=1}^{n}$
are independent random samples from the same population
$F_\theta(x) = F(x-\theta)$, where $F(x)$ is a continuous
cumulative distribution function with unique median
0. Without loss of generality, we take $\theta = 0$. The
exact size of the two-sample two-sided test (1.4) is
obtained at once from the following theorem.


Theorem 2.1. Let $X_{(a)}$ denote the $a$th ordered
observation from $\{X_i\}_{i=1}^{m}$ and let $Y_{(b)}$ denote the $b$th
ordered observation from $\{Y_j\}_{j=1}^{n}$. Then

    $P(X_{(a)} < Y_{(b)}) = \sum_{t=a}^{m} \binom{m}{t} \frac{\Gamma(n+1)}{\Gamma(b)\Gamma(n-b+1)} \cdot \frac{\Gamma(b+t)\,\Gamma(m+n+1-b-t)}{\Gamma(n+m+1)}$ .   (2.1)


Proof. We note that

    $P(X_{(a)} < Y_{(b)}) = P(F(X_{(a)}) < F(Y_{(b)})) = P(U_{1(a)} < U_{2(b)})$

where $U_{1(a)} \sim \mathrm{Beta}(a, m-a+1)$, $U_{2(b)} \sim \mathrm{Beta}(b, n-b+1)$
and they are independent. Thus,

    $P(U_{1(a)} < U_{2(b)})$

    $= \int_0^1 \int_0^y \frac{\Gamma(m+1)}{\Gamma(a)\Gamma(m-a+1)} x^{a-1}(1-x)^{m-a} \, \frac{\Gamma(n+1)}{\Gamma(b)\Gamma(n-b+1)} y^{b-1}(1-y)^{n-b} \, dx \, dy$

    $= \int_0^1 \Big( \sum_{t=a}^{m} \binom{m}{t} y^{t}(1-y)^{m-t} \Big) \frac{\Gamma(n+1)}{\Gamma(b)\Gamma(n-b+1)} y^{b-1}(1-y)^{n-b} \, dy$

    $= \sum_{t=a}^{m} \binom{m}{t} \frac{\Gamma(n+1)}{\Gamma(b)\Gamma(n-b+1)} \cdot \frac{\Gamma(b+t)\,\Gamma(n+m+1-b-t)}{\Gamma(m+n+1)} \int_0^1 \frac{\Gamma(m+n+1)}{\Gamma(b+t)\,\Gamma(n+m+1-b-t)} y^{b+t-1}(1-y)^{n+m-b-t} \, dy$ .

The integrand is a beta probability density function with
parameters $\alpha = b + t$ and $\beta = n + m - b - t + 1$. Hence,
the integral is 1. ■
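
As a sanity check on Theorem 2.1, the formula can be compared with brute-force enumeration for small samples. The sketch below is illustrative only: the function names are hypothetical, and it relies on the fact that under $H_0$ every placement of the $m$ X-ranks among the $m+n$ combined ranks is equally likely.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, factorial


def prob_formula(a, b, m, n):
    """P(X_(a) < Y_(b)) from Theorem 2.1, using Gamma(k) = (k-1)! for integer k."""
    total = Fraction(0)
    for t in range(a, m + 1):
        beta_const = Fraction(factorial(n), factorial(b - 1) * factorial(n - b))
        beta_ratio = Fraction(factorial(b + t - 1) * factorial(m + n - b - t),
                              factorial(m + n))
        total += comb(m, t) * beta_const * beta_ratio
    return total


def prob_enumeration(a, b, m, n):
    """Same probability by listing every placement of the X ranks in 1..m+n."""
    hits = 0
    count = 0
    all_ranks = range(1, m + n + 1)
    for x_ranks in combinations(all_ranks, m):
        x_set = set(x_ranks)
        y_ranks = [r for r in all_ranks if r not in x_set]
        # combinations() yields sorted tuples, so index a-1 is the a-th order statistic
        if x_ranks[a - 1] < y_ranks[b - 1]:
            hits += 1
        count += 1
    return Fraction(hits, count)


if __name__ == "__main__":
    m, n = 4, 3
    for a in range(1, m + 1):
        for b in range(1, n + 1):
            print(a, b, prob_formula(a, b, m, n), prob_enumeration(a, b, m, n))
```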


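Corollary 2.1. The exact size of the two-sample two-sided
test (1.4), $\alpha$, is given by

    $\alpha = P(U_x < L_y) + P(U_y < L_x)$

    $= \sum_{t=m-d_x+1}^{m} \frac{\binom{m}{t}\binom{n}{d_y}}{\binom{m+n}{d_y+t}} \cdot \frac{d_y}{d_y+t} \;+\; \sum_{t=n-d_y+1}^{n} \frac{\binom{n}{t}\binom{m}{d_x}}{\binom{m+n}{d_x+t}} \cdot \frac{d_x}{d_x+t}$ .   (2.2)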


Proof. For $P(U_x < L_y)$, let $a = m - d_x + 1$, $b = d_y$,
apply (2.1), and some algebraic manipulation yields the
first term in (2.2).

For $P(U_y < L_x)$, first interchange $m$ with $n$ in
(2.1), then let $a = n - d_y + 1$, $b = d_x$ and (2.1) will,
after some algebra, yield the second term of (2.2). ■

We emphasize that the size of the test depends on
the depths $d_x$ and $d_y$. A change in either one of the
values alters the size. Once $d_x$ and $d_y$ have been
selected, the corollary enables us to compute the exact
probability of committing a Type I error. In the next
section we show that $P(U_x < L_y) = P(U_y < L_x)$. Hence,
each equals $\alpha/2$. We need only compute the first or
second term of (2.2) and multiply by 2 to obtain $\alpha$.
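
The exact size in (2.2) is straightforward to evaluate with integer arithmetic. The following is a small sketch under the reading of (2.2) in which each summand is a ratio of binomial coefficients times $d/(d+t)$; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def tail_term(m, n, dx, dy):
    """First term of (2.2): P(U_x < L_y), summing over t = m-dx+1, ..., m."""
    total = Fraction(0)
    for t in range(m - dx + 1, m + 1):
        total += Fraction(comb(m, t) * comb(n, dy) * dy,
                          comb(m + n, dy + t) * (dy + t))
    return total


def exact_size(m, n, dx, dy):
    """Return (P(U_x < L_y), P(U_y < L_x), two-sided size alpha)."""
    first = tail_term(m, n, dx, dy)
    # The second term of (2.2) is the first with the roles of the samples swapped.
    second = tail_term(n, m, dy, dx)
    return first, second, first + second


if __name__ == "__main__":
    first, second, alpha = exact_size(10, 8, 3, 2)
    print(float(first), float(second), float(alpha))
```

In the cases tried, the two terms agree, in line with the symmetry shown in the next section.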

In the one-sided situation, we reject $H_0 : \Delta = 0$ in
favor of $H_a : \Delta > 0$ ($\Delta < 0$) if $U_x < L_y$ ($U_y < L_x$).
Thus, the exact size of the one-sided test is given by
either term. For a table which provides values for
$(d_x, d_y)$ for various low sample sizes $(m,n)$ that yield
useful one-sample confidence coefficients $(\gamma_x, \gamma_y)$
corresponding to a desirable confidence coefficient $\gamma = 1 - \alpha$
for the two-sample interval, see Tableman (1984, Table 1).

For sample sizes $(m,n)$ not found in the table, one
can use the normal approximation (1.6). To approximate
the size, compute

    $v_x = (d_x - m/2 - .5)/(m^{1/2}/2)$ ,  $v_y = (d_y - n/2 - .5)/(n^{1/2}/2)$

and evaluate $\Phi(\cdot)$ at

    $v = (n/(n+m))^{1/2} v_x + (m/(n+m))^{1/2} v_y$ .

Multiply by 2 for the two-sided test. For a second-order
approximation of the size, which improves the normal
approximation, see Tableman (1984, p. 28).
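
The normal approximation just described takes only a few lines. This is an illustrative sketch with a hypothetical function name; it evaluates $\Phi$ with the standard library rather than tables.

```python
from math import sqrt
from statistics import NormalDist


def approx_size(m, n, dx, dy, two_sided=True):
    """Normal approximation to the size of the test with depths dx and dy."""
    vx = (dx - m / 2 - 0.5) / (sqrt(m) / 2)
    vy = (dy - n / 2 - 0.5) / (sqrt(n) / 2)
    v = sqrt(n / (n + m)) * vx + sqrt(m / (n + m)) * vy
    tail = NormalDist().cdf(v)          # Phi(v), the one-sided size
    return 2 * tail if two_sided else tail


if __name__ == "__main__":
    print(approx_size(100, 100, 40, 40))
    print(approx_size(100, 100, 40, 40, two_sided=False))
```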




3.  A  SUM  STATISTIC 


In this section we present an equivalent formulation
of the test procedure (1.4) in terms of a sum statistic,
and obtain this statistic's null distribution. As will be
seen in the next section, this form enables us to consider
the problem of large deviations for use in stochastic
comparisons (in the Bahadur sense), and facilitates
the task of obtaining Bahadur slopes.

We first consider the one-sided situation. To test
$H_0 : \Delta = 0$ versus $H_a : \Delta > 0$, we reject $H_0$ if
$U_x < L_y$. Now,

    $X_{(m-d_x+1)} < Y_{(d_y)}$ if and only if
    $\sum_{i=1}^{m} I\{X_i < Y_{(d_y)}\} \ge m - d_x + 1$

where $I\{A\}$ is the indicator function of the event $A$.
Let

    $S_x(d_y) = \sum_{i=1}^{m} I\{X_i < Y_{(d_y)}\}$ .             (3.1)

Then, we reject $H_0$ if $S_x(d_y) \ge m - d_x + 1$. The next
theorem gives the null distribution of $S_x(d_y)$.


Theorem 3.1. Under $H_0 : \Delta = 0$, the probability
distribution function of $S_x(d_y)$ is given by

    $P(S_x(d_y) = t) = \frac{\binom{m}{t}\binom{n}{d_y}}{\binom{m+n}{d_y+t}} \cdot \frac{d_y}{d_y+t}$ ,  $t = 0, 1, \ldots, m$ .   (3.2)


Proof. Under $H_0$ we may represent the probability space
by a simple urn model with $m$ x's and $n$ y's. We
draw the x's and y's out of the urn one at a time
without replacement. Then $P(S_x(d_y) = t)$ is the
probability that after $d_y - 1 + t$ draws we have $t$ x's
and $(d_y - 1)$ y's and on the next draw we obtain a y.
Hence

    $P(S_x(d_y) = t) = \frac{\binom{m}{t}\binom{n}{d_y-1}}{\binom{m+n}{d_y+t-1}} \cdot \frac{n-d_y+1}{m+n-d_y-t+1}$ .

After some algebraic manipulation, expression (3.2) is
obtained. ■


This  probability  distribution  function  previously  appeared 
in  (2.2) . 


We note that this distribution is not symmetric. If
$Y_{(d_y)}$ were replaced by the median of the Y sample, the
statistic defined in (3.1) would be Mathisen's (1943)
test statistic $\sum_{i=1}^{m} I\{X_i < \mathrm{med}_{j=1,\ldots,n} Y_j\}$. When
$n = 2k - 1$, the distribution of $S_x(d_y)$ is symmetric if
and only if $d_y = k$. When $n = 2k$, there is no integer
$d_y$ for which $S_x(d_y)$ has a symmetric distribution.
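
The null distribution (3.2) and the symmetry statements above can be checked directly for small samples. A minimal sketch with hypothetical helper names, using exact fractions:

```python
from fractions import Fraction
from math import comb


def pmf(m, n, dy):
    """Null pmf (3.2) of S_x(d_y) for t = 0, ..., m, as exact fractions."""
    return [Fraction(comb(m, t) * comb(n, dy) * dy,
                     comb(m + n, dy + t) * (dy + t)) for t in range(m + 1)]


def is_symmetric(p):
    """True when p[t] == p[m - t] for every t."""
    return all(p[t] == p[len(p) - 1 - t] for t in range(len(p)))


if __name__ == "__main__":
    m = 6
    for n in (5, 6):
        for dy in range(1, n + 1):
            p = pmf(m, n, dy)
            print(n, dy, sum(p), is_symmetric(p))
```

With $n = 5$ only $d_y = 3$ is reported as symmetric, and with $n = 6$ no depth is.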

Our  final  observation  is  stated  as  a  corollary  to 
Theorem  3.1. 

Corollary 3.1.

    $P(U_x < L_y) = P(U_y < L_x)$ .                            (3.3)

Proof. Now, $U_x < L_y$ iff $S_x(d_y) \ge m - d_x + 1$. Further,
$U_y < L_x$ iff $\sum_{i=1}^{m} I\{X_i > Y_{(n-d_y+1)}\} \ge m - d_x + 1$.
An argument similar to that given in the proof of (3.2)
together with $P\{X_i = Y_j\} = 0$ gives

    $P\{\sum_{i=1}^{m} I\{X_i > Y_{(n-d_y+1)}\} = t\} = \frac{\binom{m}{t}\binom{n}{d_y}}{\binom{m+n}{d_y+t}} \cdot \frac{d_y}{d_y+t}$ .

The result follows.


4. A LARGE DEVIATIONS RESULT AND BAHADUR EFFICIENCY

Briefly, Bahadur (1967) efficiency is a comparison
of the rates (called Bahadur slopes) at which the Type I
error probabilities of two test procedures tend to zero
while the Type II error probabilities remain fixed at (or tend
to) a value $\beta(\Delta)$, $0 < \beta(\Delta) < 1$, for fixed $\Delta$. An alternative
formulation is in terms of (asymptotically) fixed-width
confidence intervals. That is, we compare the rates at
which the Type I error probabilities tend to zero while
the lengths remain fixed at (or tend to) a positive
constant $L = 2a$ not depending on $\Delta$. Such a formulation
was first considered by Serfling and Wackerly (1976) for
use in the construction and analysis of sequential
confidence interval procedures.

Remark 1. The equivalence between the two formulations
is seen in the following example: In the one-sample
setting, consider the interval centered at the sample mean
for the location parameter $\theta$, i.e. $I_m = [\bar{X}_m \pm a]$, $a > 0$.
For the sequence of intervals $\{I_m\}$, define the associated
sequence of tests of $H_0 : \theta = 0$ versus $H_a : \theta = a$
(or $-a$) by the rejection rule: reject $H_0$ if $0 \notin I_m$.
It is easily seen that the Type I error probability,
$2\alpha_m = P\{0 \notin I_m\}$, tends to zero. In addition, note that
the probability of a Type II error (covering 0 when $a$
or $-a$ attains) tends to 1/2, which suffices to make
the stochastic comparison. In general, let $\beta_m$ represent
the sequence of Type II error probabilities. As long as
$\beta_m$ tends to some quantity $\beta$, $0 < \beta < 1$, then if
$-\log \alpha_m / m$ converges, it converges to 1/2 of the
Bahadur slope. (See Serfling, 1980, § 10.4.2.)


Since the length of the two-sample interval (1.5)
is simply the sum of the lengths of the two one-sample
intervals, the strategy we take is to first build a
fixed-width two-sample interval from two fixed-width one-sample
intervals, then use the sum statistic formulation of the
test (3.1) to obtain the rate at which the Type I error
(or equivalently the noncoverage) probability tends to
zero. For ease of discussion we assume $F$ is symmetric
about zero. We also assume that $F$ satisfies assumption
(1.1) with $\xi_p = b$ or $a$, $b > 0$ and $a > 0$.


Consider the confidence interval (1.2) for $\theta_x$. Define
the depths as follows:

    $d(m) = m(1/2 - \varphi_x)$ ,  $u(m) = m - d(m) + 1$           (4.1)

where $\varphi_x = F_{\theta_x}(\theta_x + b) - 1/2$, $b > 0$ (see Figure 1). By
symmetry then,

    $1/2 + \varphi_x = F_{\theta_x}(\theta_x + b) = F(b)$  and
                                                               (4.2)
    $1/2 - \varphi_x = F_{\theta_x}(\theta_x - b) = F(-b)$ .

Therefore, by construction, $\theta_x - b$ and $\theta_x + b$
correspond to lower and upper $(1/2 - \varphi_x)$ quantiles,
respectively, of the distribution $F_{\theta_x}(x)$. Similarly define
the depths for the endpoints of the confidence interval
for $\theta_y$, with

    $d(n) = n(1/2 - \varphi_y)$ ,  $u(n) = n - d(n) + 1$           (4.3)

where $\varphi_y = F_{\theta_y}(\theta_y + a) - 1/2 = F(a) - 1/2$, $a > 0$.

With the depths so defined we can appeal to Bahadur's
almost sure representation of the central order statistic.
(See Serfling, 1980, p. 93.) We state this representation
for the endpoints $X_{(d(m))}$, $X_{(u(m))}$.

With probability 1,

    $X_{(d(m))} = \theta_x - b + [(1/2 - \varphi_x) - F_m(\theta_x - b)]/f(b) + o(m^{-1/2})$
                                                               (4.4)
    $X_{(u(m))} = \theta_x + b + [(1/2 + \varphi_x) - F_m(\theta_x + b)]/f(b) + o(m^{-1/2})$

where $F_m$ is the empirical distribution function. Let
$\Lambda_m$, $\Lambda_n$, and $\Lambda_{m,n}$ denote the lengths of the intervals
(1.2), $[L_y, U_y]$, (1.5) respectively with depths defined
as in (4.1), (4.3). Then it immediately follows that as
$m,n \to \infty$, with probability 1

    $\Lambda_m \to 2b$ ,  $\Lambda_n \to 2a$ ,  and  $\Lambda_{m,n} \to 2a + 2b$ .   (4.5)
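
The fixed-width construction can be illustrated by simulation. The sketch below assumes $F$ is the standard normal, rounds the depth in (4.1) down to an integer (an assumption; the text treats depths asymptotically), and uses hypothetical function names.

```python
import random
from statistics import NormalDist


def fixed_width_depths(size, half_width):
    """Depths as in (4.1)/(4.3) for standard normal F: d = size * (1/2 - phi)."""
    phi = NormalDist().cdf(half_width) - 0.5
    d = max(1, int(size * (0.5 - phi)))   # integer rounding is an assumption
    u = size - d + 1
    return d, u


def simulated_length(size, half_width, theta=0.0, seed=0):
    """Length X_(u) - X_(d) of the sign-interval for one simulated sample."""
    rng = random.Random(seed)
    sample = sorted(rng.gauss(theta, 1.0) for _ in range(size))
    d, u = fixed_width_depths(size, half_width)
    return sample[u - 1] - sample[d - 1]


if __name__ == "__main__":
    for size in (100, 1000, 20000):
        print(size, fixed_width_depths(size, 0.5), simulated_length(size, 0.5))
```

As the sample size grows, the simulated length settles near $2b$ (here $2 \times 0.5$), as (4.5) states.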

The Type I error probability of the two-sample test
(1.4) is given by

    $2\alpha_{m,n} = P_0\{X_{(u(m))} < Y_{(d(n))}\} + P_0\{Y_{(u(n))} < X_{(d(m))}\}$
    $\qquad\;\; = 2P_0\{X_{(u(m))} < Y_{(d(n))}\}$

where the last equality follows from the symmetry
established in Corollary 3.1. It follows from the sum
statistic formulation of the test (3.1) that

    $\alpha_{m,n} = P_0\{S_x(d(n)) \ge m - d(m) + 1\}$              (4.6)

where the null distribution of $S_x(d(n))$ is given in
Theorem 3.1. Suppose that $m,n \to \infty$ so that $m/(m+n) \to \lambda$,
$0 < \lambda < 1$. Then (by a straightforward argument) under
$\Delta = 0$,

    $S_x(d(n))/(m+n) \to \lambda F(-a)$ in probability

and from (4.1)

    $(m - d(m) + 1)/(m+n) \to \lambda F(b) > \lambda F(-a)$

since both $a$ and $b$ are positive. Therefore

    $\alpha_{m,n} \to 0$ as $m,n \to \infty$ .

The  following  lemma  establishes  the  probability  of  large 
deviations  for  the  sum  statistic  Sx(d(n))  .  The  proof  is 
given  in  the  appendix. 


Lemma 4.1. Assume $m/N \to \lambda$, $0 < \lambda < 1$, $N = m + n$, as $n,m \to \infty$.
Without loss of generality, take $m \ge n$. Then for $\tau$
such that $\lambda/2 < \tau < \lambda$, with $\rho = 1 - \lambda$,

    $\lim_{n,m\to\infty} N^{-1} \log P_0\{S_x(d(n)) \ge N\tau\}$

    $= \tau \log((1-\rho)/\tau) + (1-\rho-\tau)\log((1-\rho)/(1-\rho-\tau))$

    $+ \rho\log 2 - (\rho(1-2\varphi_y)/2)\log(1-2\varphi_y) - (\rho(1+2\varphi_y)/2)\log(1+2\varphi_y)$

    $- \log 2 + ((2\tau+\rho(1-2\varphi_y))/2)\log(\rho(1-2\varphi_y)+2\tau)$

    $+ ((2-2\tau-\rho(1-2\varphi_y))/2)\log(2-2\tau-\rho(1-2\varphi_y))$ ,

where $\varphi_y$ is given in (4.3).
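
The limit in Lemma 4.1 can be compared with the finite-sample rate computed from the null distribution of Theorem 3.1. This is a rough numerical sketch: the function names are hypothetical, log-binomials are evaluated with `lgamma`, and the depth and threshold are truncated to integers.

```python
from math import lgamma, log


def log_comb(a, b):
    """log of the binomial coefficient C(a, b)."""
    return lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)


def lemma_limit(tau, rho, phi_y):
    """Right-hand side of Lemma 4.1 with lambda = 1 - rho."""
    lam = 1.0 - rho
    k = rho * (1.0 - 2.0 * phi_y)
    return (tau * log(lam / tau) + (lam - tau) * log(lam / (lam - tau))
            + rho * log(2.0) - (k / 2.0) * log(1.0 - 2.0 * phi_y)
            - (rho * (1.0 + 2.0 * phi_y) / 2.0) * log(1.0 + 2.0 * phi_y)
            - log(2.0) + ((2.0 * tau + k) / 2.0) * log(k + 2.0 * tau)
            + ((2.0 - 2.0 * tau - k) / 2.0) * log(2.0 - 2.0 * tau - k))


def finite_rate(N, tau, rho, phi_y):
    """N^{-1} log P{S_x(d(n)) = [N tau]} from the pmf in Theorem 3.1."""
    n = int(round(rho * N))
    m = N - n
    d = int(n * (0.5 - phi_y))
    t = int(N * tau)
    log_p = (log_comb(m, t) + log_comb(n, d) - log_comb(N, d + t)
             + log(d / (d + t)))
    return log_p / N


if __name__ == "__main__":
    for N in (2000, 20000, 200000):
        print(N, finite_rate(N, 0.35, 0.5, 0.2), lemma_limit(0.35, 0.5, 0.2))
```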


The theorem that follows establishes that the Type I
error probability of the two-sample test based on the
comparison of two fixed-width one-sample sign-intervals
converges to zero at an exponential rate. We refer to this
rate as the index of exponential convergence and denote
it by $e(a,b)$ as it depends on the choices of $a$ and $b$
as well as the distribution $F$.

Theorem 4.1. Under the same assumptions as those given
in Lemma 4.1, for the sequence of intervals (1.5) with
depths defined by (4.1) and (4.3), the index of exponential
convergence of $\alpha_{m,n}$ (4.6) is

    $-e(a,b) = \lim_{n,m\to\infty} N^{-1} \log \alpha_{m,n}$

    $= -(1-\rho)F(b)\log F(b) - (1-\rho)(1-F(b))\log(1-F(b))$

    $+ \rho\log 2 - \log 2$

    $- \rho(1-F(a))\log(2(1-F(a))) - \rho F(a)\log(2F(a))$           (4.7)

    $+ ((1-\rho)F(b) + \rho(1-F(a)))\log(2(1-\rho)F(b) + 2\rho(1-F(a)))$

    $+ (1 - (1-\rho)F(b) - \rho(1-F(a)))\log(2 - 2(1-\rho)F(b) - 2\rho(1-F(a)))$ .
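
Expression (4.7) is easy to evaluate numerically. The sketch below uses a hypothetical function name and takes $F$ to be the standard normal c.d.f. by default; it can be used to reproduce entries of the kind reported in Table 1 for $a = b = c/2$.

```python
from math import log
from statistics import NormalDist


def index(a, b, rho, cdf=NormalDist().cdf):
    """Index e(a, b) from (4.7); rho = 1 - lambda and F = cdf."""
    lam = 1.0 - rho
    Fa, Fb = cdf(a), cdf(b)
    beta = lam * Fb + rho * (1.0 - Fa)
    minus_e = (-lam * Fb * log(Fb) - lam * (1.0 - Fb) * log(1.0 - Fb)
               + rho * log(2.0) - log(2.0)
               - rho * (1.0 - Fa) * log(2.0 * (1.0 - Fa))
               - rho * Fa * log(2.0 * Fa)
               + beta * log(2.0 * beta)
               + (1.0 - beta) * log(2.0 - 2.0 * beta))
    return -minus_e


if __name__ == "__main__":
    for rho in (0.5, 0.25, 0.125):
        for c in (0.01, 0.1, 1.0, 2.0, 4.0):
            print(rho, c, round(1000 * index(c / 2, c / 2, rho), 4))
```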


Proof. From (4.1) and (4.2), we have

    $m - d(m) + 1 = N(\lambda F(b) + o(1))$ ,  $b > 0$ .

Let $\tau_N$ denote $\lambda F(b) + o(1)$, and $\tau$ denote $\lambda F(b)$.
Then $\tau_N \to \tau$ as $n,m \to \infty$, and

    $\lambda/2 < \lambda F(b) < \lambda$ .

From (4.3),

    $(1-\lambda)(1-2\varphi_y)/2 = (1-\lambda)F(-a) = \rho F(-a)$ .

Hence, Lemma 4.1 applies with $\tau$ replaced by $\lambda F(b)$.
After some algebraic manipulation, the expression (4.7)
is obtained. ■

Remark 2. Five interesting cases are the following:

(a) If $a = b$, the index is symmetric in $\rho$ and
$1 - \rho$ (i.e. in $1-\lambda$ and $\lambda$).

(b) If $a = b$ and $m = n$, the index reduces to the
index of Mood's test. (See Woodworth, 1970.)

(c) If $a$ and $b$ are related via the relationship

    $\lambda F(b) + (1-\lambda)F(-a) = 1/2$ ,                       (4.8)

then the index is again the index of Mood's test.

(d) Suppose that the asymptotic length of one interval
vanishes, e.g. $a = 0$. Then the index reduces to
that of Mathisen's statistic (Killeen, et al., 1972).

(e) If $m = n$ then for $a + b = c$, the index is
maximized by $a = b = c/2$ which yields Mood's
statistic. On the other hand, the index is a minimum
for $a + b = c$ just when $a$ or $b$ is 0, which
yields Mathisen's statistic. Hence, for equal sample
sizes Mood's test is best and Mathisen's test is worst.
However, for more extreme sample size ratios, Mathisen's
test has a larger index than Mood's test (see Killeen,
et al., 1972).

These remarks are crucial in that they show the intricate
relationship of the special Mood and Mathisen-intervals to
that of the general two-sample interval constructed from two
arbitrarily chosen (asymptotically) fixed-width sign-intervals.


5. NUMERICAL COMPARISONS AND DISCUSSION


Thus, various median tests arise as special cases
as a result of formulating the problem in terms of
(asymptotically) fixed-width intervals. In this context
we are able to distinguish between the two-sample test
based on the Mood-interval and any other solution to the
condition (1.7).

In order to make efficiency comparisons we specify
a constant $c > 0$ and then consider values $a$ and $b$
such that $a + b = c$ with specified ratio $a/b$. For the
Mood-interval, however, we are not free to do this. The
relationship (4.8) in terms of $c$ is
$\lambda F(b) + (1-\lambda)F(b-c) = 1/2$. Once $c$ is specified, $b$
and hence $a$ are determined by this additional constraint.
The (Bahadur) asymptotic efficiency as $m,n \to \infty$ (with
$m/(m+n) \to \lambda$) of Procedure A relative to Procedure B
is then

    eff(A,B) = index(A)/index(B) .
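
The Mood constraint and the resulting efficiency can be computed as follows. This is a sketch under stated assumptions: $F$ is the standard normal, the constraint is solved by plain bisection (any root finder would do), and the function names are hypothetical. The index function restates (4.7).

```python
from math import log
from statistics import NormalDist

F = NormalDist().cdf


def index(a, b, rho):
    """Index e(a, b) from (4.7) at the standard normal, rho = 1 - lambda."""
    lam = 1.0 - rho
    Fa, Fb = F(a), F(b)
    beta = lam * Fb + rho * (1.0 - Fa)
    return -(-lam * Fb * log(Fb) - lam * (1.0 - Fb) * log(1.0 - Fb)
             + rho * log(2.0) - log(2.0)
             - rho * (1.0 - Fa) * log(2.0 * (1.0 - Fa))
             - rho * Fa * log(2.0 * Fa)
             + beta * log(2.0 * beta)
             + (1.0 - beta) * log(2.0 - 2.0 * beta))


def mood_b(c, rho, tol=1e-12):
    """Solve lam * F(b) + (1 - lam) * F(b - c) = 1/2 for b in [0, c]."""
    lam = 1.0 - rho
    lo, hi = 0.0, c          # the left side minus 1/2 is <= 0 at 0 and >= 0 at c
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lam * F(mid) + rho * F(mid - c) - 0.5 < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def efficiency_equal_vs_mood(c, rho):
    """eff(equal lengths, Mood) = index(a = b = c/2) / index(Mood)."""
    b = mood_b(c, rho)
    return index(c / 2, c / 2, rho) / index(c - b, b, rho)


if __name__ == "__main__":
    for rho in (0.5, 0.25, 0.125):
        for c in (1.0, 2.0, 4.0):
            print(rho, c, round(mood_b(c, rho), 4),
                  round(efficiency_equal_vs_mood(c, rho), 3))
```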

Table 1 provides numerical evaluation of the indices of
exponential convergence. We select values of 1/2, 1/4,
1/8 for $\rho = 1 - \lambda$; and values of 1, 2/3, and 3/2
for the ratio $a/b$. Without loss of generality, we take
$(a,\rho)$ to correspond to the interval formed on the Y-sample.
Evaluation of the indices is done at the standard normal
distribution. For tables with indices evaluated at the
logistic and Laplace distributions see Tableman (1984).
These tables reveal similar information and thus are
omitted. Figure 2 supplies a graphical display of the
efficiencies of the equal asymptotic lengths ($a = b$) solution
relative to the Mood-interval.

Based on the information displayed in the table and
figure, and with economic considerations in mind, we
recommend taking $a = b$ for a specified $c$. For if
observations from each population are equal in cost, selecting
equal sample sizes yields the more efficient procedure (as
always). (From Remark 2(b), this solution is asymptotically
equal to the Mood procedure.) On the other hand, if one
population is more expensive to sample from than the other,
then taking two sign-intervals with equal asymptotic lengths
will provide the more efficient procedure for more extreme
values of $\rho$; and, as was noted in Remark 2(a), the index
is symmetric in $\rho$ and $(1-\rho)$. Therefore, an experimenter
can adjust the ratio of sample sizes to meet cost
constraints (for example), pick $a = b$, and obtain a more
(Bahadur) efficient procedure than if he had chosen the
Mood-interval procedure.


Table 1. Index of exponential convergence $\times 10^3$ :
         Standard normal c.d.f.

                                      c
  $\rho$    a/b          .01       .1        1         2         4

  1/2    Mood 1/1     .008      .795      75.2      256       585
         2/3          .008      .795      74.9      252       562
         3/2          .008      .795      74.9      252       562

  1/4    Mood         .006      .596      53.9      155       215
         (b=)         .0025     .025      .234      .383      .431
         1/1          .006      .597      56.8      196       466
         2/3          .006      .597      57.1      199       465
         3/2          .006      .596      56.0      188       430

  1/8    Mood         .0035     .348      29.9      76.1      95.5
         (b=)         .00125    .0125     .1122     .168      .18
         1/1          .0035     .348      33.4      118       300
         2/3          .0035     .348      33.8      122       309
         3/2          .0035     .348      32.8      111       267

Figure 2. Bahadur efficiencies of equal asymptotic lengths
(a=b) solution with respect to Mood-interval
evaluated at the standard normal. (Horizontal axis: c.)


APPENDIX


Proof of Lemma 4.1. We show that conditions of Theorem
2.2 of Killeen, et al. (1972) are satisfied. Let $[x]$
denote the greatest integer $\le x$. From Theorem 3.1,

    $\lim_{m,n\to\infty} N^{-1} \log P_0\{S_x(d(n)) = [N\tau]\}$

    $= \lim N^{-1} \log\binom{m}{[N\tau]} + \lim N^{-1} \log\binom{n}{d(n)} - \lim N^{-1} \log\binom{N}{d(n)+[N\tau]}$

    $+ \lim N^{-1} \log(d(n)/(d(n)+[N\tau]))$ .

(1) With $d(n)$ defined by (4.3),

    $d(n)/(d(n)+[N\tau]) \to ((1-\lambda)(1-2\varphi_y)/2) / ((1-\lambda)(1-2\varphi_y)/2 + \tau)$ .

Therefore, $\lim N^{-1} \log(d(n)/(d(n)+[N\tau])) = 0$.

(2) In the next three steps, we use the following:
If $\lim_{n\to\infty} a/n = \alpha$, $\lim_{n\to\infty} b/n = \beta$, $0 < \beta < \alpha < \infty$, where
$a, b$ are integers, then it follows from Stirling's
formula that

    $\lim_{n\to\infty} n^{-1} \log\binom{a}{b} = \beta\log(\alpha/\beta) + (\alpha-\beta)\log(\alpha/(\alpha-\beta))$ .
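
The Stirling limit in step (2) can be checked numerically. A minimal sketch with hypothetical function names, using `lgamma` for the log-factorials and truncating $\alpha n$ and $\beta n$ to integers:

```python
from math import lgamma, log


def log_binom_rate(alpha, beta):
    """Limit of n^{-1} log C(a, b) when a/n -> alpha and b/n -> beta."""
    return (beta * log(alpha / beta)
            + (alpha - beta) * log(alpha / (alpha - beta)))


def finite_log_binom_rate(n, alpha, beta):
    """n^{-1} log C(a, b) with a = [alpha n] and b = [beta n]."""
    a = int(alpha * n)
    b = int(beta * n)
    return (lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)) / n


if __name__ == "__main__":
    for n in (10 ** 3, 10 ** 4, 10 ** 6):
        print(n, finite_log_binom_rate(n, 0.75, 0.2), log_binom_rate(0.75, 0.2))
```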

(3) $m/N \to \lambda$, $[N\tau]/N \to \tau$; and by assumption, $0 < \tau < \lambda$.
Therefore, by (2)

    $\lim N^{-1} \log\binom{m}{[N\tau]} = \tau\log(\lambda/\tau) + (\lambda-\tau)\log(\lambda/(\lambda-\tau))$ .

(4) $n/N \to (1-\lambda)$; by (4.3),

    $d(n)/N \to (1-\lambda)(1-2\varphi_y)/2 < (1-\lambda)$ .

Therefore, by (2)

    $\lim N^{-1} \log\binom{n}{d(n)} = \rho\log 2 - (\rho(1-2\varphi_y)/2)\log(1-2\varphi_y)$

    $- (\rho(1+2\varphi_y)/2)\log(1+2\varphi_y)$

where $\rho = 1 - \lambda$.

(5) $N/N \to 1$; $(d(n)+[N\tau])/N \to (1-\lambda)(1-2\varphi_y)/2 + \tau < 1$.
Therefore, by (2) and after some algebra

    $-\lim N^{-1} \log\binom{N}{d(n)+[N\tau]}$

    $= -\log 2 + ((2\tau+\rho(1-2\varphi_y))/2)\log(\rho(1-2\varphi_y)+2\tau)$

    $+ ((2-2\tau-\rho(1-2\varphi_y))/2)\log(2-2\tau-\rho(1-2\varphi_y))$ .

Summing up (1), (3), (4), and (5), we obtain
the expression stated in the Lemma.

This along with the fact that
$\lim N^{-1} \log P_0\{S_x(d(n)) > \exp N^2\} = -\infty$ implies Condition
2.2 (of Theorem 2.2) is satisfied. Now,

    $P_0\{S_x(d(n)) = [N\tau]+1\} / P_0\{S_x(d(n)) = [N\tau]\}$

    $= ((m-[N\tau])/([N\tau]+1)) \, ((d(n)+[N\tau])/(N-d(n)-[N\tau]))$

    $\to ((\lambda-\tau)/\tau) \, (((1-\lambda)(1-2\varphi_y)/2+\tau)/(1-(1-2\varphi_y)(1-\lambda)/2-\tau))$ as $m,n \to \infty$

which is positive and finite.

Therefore,

    $N^{-1} \log(P_0\{S_x(d(n)) = [N\tau]+1\} / P_0\{S_x(d(n)) = [N\tau]\}) \to 0$ as $m,n \to \infty$ .

Condition 2.1 is satisfied.

To check the non-increasing property: Let $x \ge x_N = N\tau$.
Since $\lambda/2 < \tau < \lambda$, we only need to check for $x$ such
that


Now,

    $P\{S_x(d(n)) = [x]+1\} / P\{S_x(d(n)) = [x]\}$

    $= ((m-[x])/([x]+1)) \, ((d(n)+[x])/(N-d(n)-[x]))$ .

We need to show that for sufficiently large $N$, this ratio
is less than 1. This follows immediately from the fact
that

    $\lambda(1-2\varphi_y)/2 < \lambda/2$

and that $\lambda/2 < \tau < \lambda$. Therefore, by Theorem 2.2 of
Killeen, et al.,

    $\lim_{n,m\to\infty} N^{-1} \log P_0\{S_x(d(n)) \ge N\tau\} = \lim_{n,m\to\infty} N^{-1} \log P_0\{S_x(d(n)) = [N\tau]\}$ .